Microsatellite markers reveal genetic diversity and population structure of Portunus trituberculatus in the Bohai Sea, China

The swimming crab, Portunus trituberculatus, is one of the main aquaculture species in Chinese coastal regions due to its palatability and high economic value. To obtain a better understanding of the genetic diversity of P. trituberculatus in the Bohai Sea, the present study used 40 SSR loci to investigate the genetic diversity and population structure of 420 individuals collected from seven populations in the Bohai Sea. Genetic parameters revealed a lower level of genetic diversity in the cultured population (SI = 1.374, He = 0.687, and PIC = 0.643) than in the wild populations (SI ≥ 1.399, He ≥ 0.692, and PIC ≥ 0.651). The genetic differentiation index (Fst) and gene flow (Nm) ranged from 0.001 to 0.060 (mean: 0.022) and from 3.917 to 249.750 (mean: 31.289), respectively, showing low differentiation among the seven populations of P. trituberculatus. Population structure analysis, the phylogenetic tree, and principal component analysis (PCA) demonstrated that the seven populations of P. trituberculatus were divided into four subpopulations (K = 4), but the correlation between genetic structure and geographical distribution was not obvious. These results are expected to provide useful information for the fishery management of wild swimming crabs.


Material and methods
Sample collection and DNA extraction. Seven populations were collected from the Bohai Sea (Fig. 1, Table 1). The six wild populations were Dalian (DL), Huludao (HLD), Qinhuangdao (QHD), Huanghua (HW), Dongying (DY), and Penglai (PL). One cultured population (HC), which originated from the Bohai Sea, was sampled from the national breeding farm of swimming crabs in Huanghua (Hebei, China). The claws of all individuals were collected, immediately preserved in 95% ethanol, and stored at −20 °C. Genomic DNA was isolated from claw muscle using the TIANamp Marine animal DNA extraction kit (TIANGEN, Beijing, China) following the manufacturer's recommended protocols. After extraction, the quality and concentration of DNA

Data analysis.
Genetic diversity within P. trituberculatus populations was estimated from the following genetic parameters: the number of alleles (Na), the effective number of alleles (Ne), Shannon's diversity index (SI), observed heterozygosity (Ho), and expected heterozygosity (He), all calculated with POPGENE version 1.3 27. Based on allele frequency, polymorphism information content (PIC) was estimated with PIC-CALC software 28. Null allele frequencies (Fna) for SSR loci were calculated using GenePOP 29. P values for testing Hardy-Weinberg equilibrium (HWE) at each locus were calculated with POPGENE version 1.3. Genetic differentiation and variation were inferred using Nei's genetic distance (D) 30 and genetic identity (I), calculated with POPGENE version 1.3, and F-statistics (Fst, Fis), calculated by analysis of molecular variance (AMOVA) in GenAlEx 6.5 31 with 999 permutations. Gene flow (Nm) was estimated as Nm = (1 − Fst)/(4Fst) 32.
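As an illustration of the quantities named above, the following minimal Python sketch computes Ne, He, SI, PIC, and Nm from their textbook definitions. It is not the code used by POPGENE, PIC-CALC, or GenAlEx, which may apply sample-size corrections not shown here; the function names and the two-allele example frequencies are hypothetical. The Fst values passed to gene_flow are the extreme pairwise values reported in this study.

```python
import math

def effective_alleles(freqs):
    """Effective number of alleles: Ne = 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in freqs)

def expected_heterozygosity(freqs):
    """Expected heterozygosity: He = 1 - sum(p_i^2), without small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)

def shannon_index(freqs):
    """Shannon's diversity index: SI = -sum(p_i * ln p_i)."""
    return -sum(p * math.log(p) for p in freqs if p > 0)

def pic(freqs):
    """PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = sum(
        2 * freqs[i] ** 2 * freqs[j] ** 2
        for i in range(len(freqs))
        for j in range(i + 1, len(freqs))
    )
    return 1.0 - homozygosity - cross

def gene_flow(fst):
    """Gene flow from the formula in the text: Nm = (1 - Fst) / (4 * Fst)."""
    if fst <= 0:
        raise ValueError("Fst must be positive for Nm to be defined")
    return (1.0 - fst) / (4.0 * fst)

if __name__ == "__main__":
    # Hypothetical locus with two equally frequent alleles (illustration only).
    toy = [0.5, 0.5]
    print(effective_alleles(toy), expected_heterozygosity(toy),
          shannon_index(toy), pic(toy))
    # Extreme pairwise Fst values reported in this study.
    print(round(gene_flow(0.001), 3), round(gene_flow(0.060), 3))
```

Applied to the lowest and highest pairwise Fst values (0.001 and 0.060), the formula reproduces the reported Nm range of 249.750 and 3.917.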
The phylogenetic tree was constructed in MEGA7 33 based on Nei's genetic distance and used to test population grouping. Principal component analysis (PCA) was carried out using Canoco 4.5 to elucidate genetic relationships within and among P. trituberculatus populations. Based on the 40 polymorphic SSR loci, Bayesian model-based population genetic structure was inferred using STRUCTURE version 2.3.4 34. The putative number of populations (K) was set from 1 to 10, with 3 replicate simulations for each K value, each consisting of a burn-in period of 100,000 iterations followed by 100,000 MCMC (Markov Chain Monte Carlo) iterations. The STRUCTURE output was entered into Structure Harvester 35,36 to determine the optimum K value from the log probability of data (LnP(D)) and the ad hoc statistic ΔK, which is based on the rate of change in LnP(D) between successive K values. The results for the best K value were processed with CLUMPP 37 and visualized with Distruct 1.1 38.
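The ΔK statistic described above can be sketched as follows. This is a minimal illustration of the commonly used form, ΔK = |L(K+1) − 2L(K) + L(K−1)| / sd(L(K)), where L(K) is the mean LnP(D) over replicate runs at a given K; it is not the Structure Harvester code, and the function name and the LnP(D) values in the example are hypothetical toy numbers rather than data from this study.

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Compute delta-K for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps K -> list of LnP(D) values from replicate runs.
    K values whose replicates have zero standard deviation are skipped,
    because delta-K is undefined there.
    """
    result = {}
    for k in sorted(lnp_by_k):
        if (k - 1) not in lnp_by_k or (k + 1) not in lnp_by_k:
            continue
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / sd
    return result

if __name__ == "__main__":
    # Hypothetical LnP(D) values, three replicates per K (toy numbers only).
    toy_runs = {
        1: [-1000.0, -1002.0, -998.0],
        2: [-900.0, -901.0, -899.0],
        3: [-890.0, -892.0, -888.0],
        4: [-885.0, -886.0, -884.0],
    }
    dk = delta_k(toy_runs)
    best_k = max(dk, key=dk.get)
    print(dk, best_k)
```

The K with the largest ΔK is taken as the optimum. Note that ΔK cannot be evaluated at the smallest or largest K tested, since both neighbours are required.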
The mean values of Na, Ne, SI, Ho, He, and PIC of the seven P. trituberculatus populations ranged from 5.225 to 5.375, 3.794 to 4.103, 1.374 to 1.449, 0.624 to 0.654, 0.687 to 0.714, and 0.643 to 0.673, respectively (Table 4).
Population genetic structure. Genetic structure analysis of all 420 P. trituberculatus individuals was performed to infer the optimal K value with the ΔK method. The highest ΔK value was observed at K = 4 (Fig. 2), which indicated that the seven populations were divided into four subpopulations (Fig. 3). The populations of Dalian (DL), Dongying (DY), and Huludao (HLD) formed one subpopulation (blue). Similarly, the populations of Huanghua (HW), Penglai (PL), and Qinhuangdao (QHD) formed another subpopulation (red). In the cultured population (HC), the genetic components of most individuals were homozygous but formed two subpopulations (green and yellow). The phylogenetic tree at the individual level, based on Nei's genetic distances, provided supplementary evidence: HC individuals were scattered across different branches, whereas DY individuals clustered together (Fig. 4). The population clustering results showed that the seven populations of P. trituberculatus formed two main groups (Fig. 5). Group I included four populations: HC, QHD, PL, and HW. The HC and QHD populations clustered first, then joined the PL population, and finally the HW population. Group II included three populations: HLD, DL, and DY. Overall, DY and HC had the largest genetic distance, which revealed that the genetic structure of P. trituberculatus populations in the Bohai Sea was not significantly related to their geographical distribution. In addition, PCA demonstrated that the first two principal components explained 3.94% (PC1) and 3.68% (PC2) of the total variation and could distinguish cultivated individuals from wild populations (Fig. 6).
In summary, no obvious geographical distribution pattern was found, indicating high genetic mixing and gene flow between individuals of different populations.
Population differentiation and variation. The lowest differentiation (Fst = 0.001) and highest gene flow (Nm = 249.750) were observed between the PL and QHD populations, and the highest differentiation (Fst = 0.060) and lowest gene flow (Nm = 3.917) were observed between the HC and DY populations (Table 5). Nei's genetic distance (D) and genetic identity (I) showed a consistent pattern for the HC and DY populations (D = 0.177, I = 0.838) and the PL and QHD populations (D = 0.025, I = 0.975) (Table 6). AMOVA revealed that only 4% of the genetic variation was partitioned among populations, while 96% was found within populations (Table 7).

Discussion
Genetic diversity is a crucial criterion in estimating the adaptability of species to changing environments; hence, a better understanding of the genetic diversity of species is vital for evaluating population structure and evolutionary dynamics 39. Genetic diversity is susceptible to artificial selection, genetic drift, migration, and breeding systems 40 and is normally evaluated by genetic parameters such as polymorphism information content (PIC), Shannon's diversity index (SI), and heterozygosity (H). However, expected heterozygosity (He) could better reflect the genetic diversity of species than observed heterozygosity (Ho) 41. The current study reported PIC values of 0.415 to 0.895 for the 40 SSR loci, indicating the polymorphic nature of the loci and their suitability for assessing genetic diversity in the seven P. trituberculatus populations. Genetic analysis revealed that the genetic diversity of the wild populations (He ≥ 0.692) was higher than that of the cultivated population (He = 0.687), which was consistent with our previous report 42. A similar result was found in E. sinensis 43. In general, genetic drift, selection, and inbreeding result in low genetic variability in farmed stocks 44. In addition, many SSR loci deviated significantly from HWE (P < 0.05), which might be attributed to null alleles and heterozygote deficiency (Fis > 0). Null alleles might be accounted for by insufficient sampling 45 and variation in microsatellite flanking sequences 46. Loss of heterozygosity might be accounted for by migration, artificial selection, and inbreeding 47,48, and is common in marine species such as Scylla paramamosain 49-51 , Pinctada
Primers (fragment): DX16, F: GAGGCAAGCAAGTTAACCATTAG, R: CTTCCTGGTTACCTCATCCTACC, (GT)7, 110-147, 60; DX19, F: CACACTCGTTGCAGACACTACTT, R: CTGTTACTTACTCGGTGCTTTGG, (TG)11, 160-217, 60.
Table 3. Genetic parameters for 40 SSR loci.
Na Number of alleles; Ne Number of effective alleles; SI Shannon's diversity index; Ho Observed heterozygosity; He Expected heterozygosity; PIC Polymorphism information content; Fna Frequency of null alleles; Fis Fixation index; P Probability of significant deviation from Hardy-Weinberg equilibrium; NS Not significant (P > 0.05). *P < 0.05, **P < 0.01.

54 used ten SSRs to investigate the effect of artificial selection on the genetic structure of two abalone lines and found a loss of heterozygosity (Ho = 0.650 < He = 0.711). These studies indicated the negative impact of heterozygote deficiency on population genetic diversity. Therefore, it is necessary to maintain a high level of genetic diversity in aquatic animals to reduce the loss of heterozygosity and prevent germplasm degradation. In terms of expected heterozygosity, this study showed lower genetic diversity of P. trituberculatus in the Bohai Sea (He = 0.725) than in the Yellow Sea 47 (He = 0.814) and the East China Sea 55 (He = 0.916), which was consistent with the results revealed by SNP markers 14. It has been shown that, when conducting genetic diversity analysis on aquatic animals, the number of SSR loci should be greater than 20 and the sample size should be greater than 45 56. The number of loci and the sample size in this study meet this standard, which supports the reliability of the finding of low genetic diversity of swimming crabs in the Bohai Sea. The Bohai Sea is a semi-enclosed and shallow body of water that limits the dispersal of P. trituberculatus, leading to a decline in genetic diversity 47. In an SSR investigation of Exopalaemon carinicauda, Zhang et al. 57 reported that the Binzhou population in the Bohai Sea had the lowest level of genetic diversity, which suggested that the Bohai Sea might hinder gene flow. Moreover, marine pollution, aquaculture pollution, and reclamation also reduce genetic diversity 58. Therefore, it is necessary to carry out long-term genetic monitoring of P. trituberculatus in the Bohai Sea for full protection and utilization of the germplasm resources of this species.
Table 4. Genetic diversity indices of seven populations of P. trituberculatus from the Bohai Sea.
A stable genetic structure is central to the survival of a species; its disintegration leads to the decline or even extinction of a population. Given the economic significance of P. trituberculatus, genetic monitoring of population structure is essential for the development of effective management strategies 13. The results of the current study established that all P. trituberculatus individuals were divided into four subpopulations (Fig. 2). The DY population showed relatively low gene flow with other populations, which might be related to its geographical location: Dongying is located in the relatively closed Laizhou Bay, which restricts gene exchange between its P. trituberculatus and other populations in the Bohai Sea. The phylogenetic tree supported this result. Individuals from the HC population were located in different clades of the phylogenetic tree, which indicated strong genetic mixing between cultured and wild individuals. It is speculated that the frequent gene flow between cultured and wild populations resulted from releases and from artificial breeding that uses wild-caught crabs as parents. For example, different regions shared juvenile crabs of a full-sibling family from the Huanghua farm for artificial breeding and releases, resulting in gene flow between the HC population and different wild populations. Therefore, reasonable management measures are needed to monitor the impact of releases on wild populations and to maintain the genetic integrity of cultivated populations. However, the phylogenetic tree differed considerably from the PCA results, which might be due to the indistinct genetic differentiation and the close genetic distance between individuals. Additionally, the phylogenetic tree and PCA rely on different calculation methods 59,60. Further research is needed into the reasons for this difference.
The genetic differentiation index (Fst), an essential gauge of genetic differentiation among populations, is crucial for understanding genetic relationships. Values of 0 < Fst < 0.05, 0.05 < Fst < 0.15, 0.15 < Fst < 0.25, and Fst > 0.25 indicate negligible, moderate, high, and strong genetic differentiation, respectively 61. In this study, the HC and DY populations showed moderate differentiation (Fst = 0.060 > 0.05), which might be related to the geographical location of the two groups. Huanghua and Dongying are located on Bohai Bay and Laizhou Bay, respectively, on either side of the Yellow River estuary. The ecological environment, species distribution, and organic pollution in the Yellow River estuary led to differences between the two sea areas 62,63, which in turn led to differences in the activity range and habitat preference of P. trituberculatus and ultimately resulted in the comparatively high genetic differentiation between the HC and DY populations. In addition, geographic isolation also leads to lower gene exchange between the cultivated and wild populations than among wild populations in the open sea, as shown by the genetic differentiation index: the average Fst between the HC and wild populations was 0.031, whereas that among wild populations was 0.017 (Table 5). Moreover, the average values of gene flow (Nm = 31.289), genetic distance (D = 0.08), and genetic identity (I = 0.924) also demonstrated low genetic differentiation and strong genetic admixture among the seven P. trituberculatus populations.
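The Fst thresholds quoted above can be expressed as a small lookup, sketched here in Python. The function name is hypothetical, and the treatment of values that fall exactly on a boundary (0.05, 0.15, 0.25) is an assumption, since the text gives only strict inequalities; the inputs in the example are Fst values reported in this study.

```python
def classify_fst(fst):
    """Map an Fst value to the differentiation category given in the text.

    Boundary values are assigned to the higher category; this choice is an
    assumption, because the quoted ranges use strict inequalities.
    """
    if fst < 0:
        raise ValueError("Fst is expected to be non-negative here")
    if fst < 0.05:
        return "negligible"
    if fst < 0.15:
        return "moderate"
    if fst < 0.25:
        return "high"
    return "strong"

if __name__ == "__main__":
    # Fst values reported in this study.
    reported = {
        "HC vs DY": 0.060,
        "PL vs QHD": 0.001,
        "mean, HC vs wild": 0.031,
        "mean, among wild": 0.017,
    }
    for label, value in reported.items():
        print(label, value, classify_fst(value))
```

Under these thresholds only the HC and DY pair falls in the moderate category; the mean values between HC and the wild populations, and among the wild populations, both fall in the negligible category.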

Conclusions
In summary, this study provided useful insights into the population structure of P. trituberculatus throughout the coastal areas of the Bohai Sea. Forty microsatellite loci revealed a low level of genetic diversity in the seven P. trituberculatus populations in the Bohai Sea. A low level of genetic differentiation and frequent gene flow among these seven populations were revealed, suggesting high genetic connectivity. The structure analysis identified four subpopulations, but the clustering pattern was not related to geographical location. To increase the genetic diversity of P. trituberculatus, practical and effective protective measures should be taken to prevent the degeneration of germplasm resources. This study also provides a theoretical basis for selecting parents from different geographical populations in artificial breeding programs.

Data availability
All data generated or analyzed during this study are included in this article.